The final inquiry question our team settled on was: which states purchase the most research-related drugs, and is there a correlation between this and the socio-economic status of the state?
As the unemployment rate of a state increases, the total amount spent on research-related prescriptions increases.
finding1.show()
This finding may seem to come with plenty of outliers. Indeed, were it not for six outlying states (California, Texas, Florida, New York, Massachusetts and North Carolina), the data would appear to follow a strong positive trend.
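One way to check the claim that the trend is strong once these states are set aside is to compute the Pearson correlation twice, once on all states and once without the six. The snippet below is a minimal sketch, not the code behind the figure above: the row layout `(state, unemployment_rate, total_spent)`, the use of two-letter state codes, and the helper names are assumptions, and the demo rows are made-up placeholders rather than values from our dataset.

```python
from math import sqrt

# The six states named above, as two-letter codes (assumed key format).
OUTLIER_STATES = {"CA", "TX", "FL", "NY", "MA", "NC"}


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def trend_with_and_without(rows, excluded=OUTLIER_STATES):
    """Return (r over all rows, r over rows whose state is not excluded).

    rows: iterable of (state, unemployment_rate, total_spent) tuples.
    """
    rows = list(rows)
    kept = [row for row in rows if row[0] not in excluded]
    r_all = pearson_r([r[1] for r in rows], [r[2] for r in rows])
    r_kept = pearson_r([r[1] for r in kept], [r[2] for r in kept])
    return r_all, r_kept


# Placeholder rows for illustration only; "AA".."DD" are not real states
# and none of these numbers come from the payments dataset.
demo_rows = [
    ("AA", 3.0, 10.0),
    ("BB", 4.0, 20.0),
    ("CC", 5.0, 30.0),
    ("DD", 6.0, 40.0),
    ("CA", 4.0, 500.0),
]

r_all, r_kept = trend_with_and_without(demo_rows)
print(f"r (all states) = {r_all:.3f}, r (outliers removed) = {r_kept:.3f}")
```

If the coefficient on the real data rises sharply once the six states are removed, that would support reading them as a separate group rather than as noise around the main trend.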
These six states remain troublesome for the entirety of our analysis. They are outliers in every sense of the inquiry question: they spend far more than the other states on research-related prescriptions, as the choropleth map and boxplot below make clear.
choro_finding1.show()
box_finding1.show()
Yes, the outliers on the boxplot are the same six troublesome states. Hence, our inquiry question shifted slightly. We already understood that unemployment rate describes the general trend across states in total money spent on research-related drugs, so we began to investigate what makes states such as California and Texas different. Is there a correlation that we are missing?
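The boxplot's outlier points can also be recovered numerically. The sketch below applies the common 1.5 × IQR fence rule, which is an assumption about how the plot's whiskers were drawn, and the quartile method may differ slightly from the one the plotting library uses. The function name, the `{state: total_spent}` dictionary layout, and the demo values are all hypothetical.

```python
from statistics import quantiles


def iqr_outliers(spend_by_state, k=1.5):
    """Return the states whose spend lies outside the k * IQR fences.

    spend_by_state: dict mapping a state label to its total spend.
    """
    values = list(spend_by_state.values())
    # Three cut points (Q1, median, Q3); "inclusive" treats the data
    # as the whole population rather than a sample.
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower = q1 - k * iqr
    upper = q3 + k * iqr
    return sorted(
        state
        for state, spend in spend_by_state.items()
        if spend < lower or spend > upper
    )


# Placeholder values for illustration only, not taken from the dataset.
demo_spend = {
    "AA": 10.0,
    "BB": 12.0,
    "CC": 11.0,
    "DD": 13.0,
    "EE": 9.0,
    "FF": 200.0,
}

flagged = iqr_outliers(demo_spend)
print(flagged)
```

Running the same function on the real per-state totals and comparing the result with the six states listed earlier would confirm whether the boxplot and the scatter plot are flagging the same group.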
As the number of general-care hospitals and military hospitals in a state increases, the amount spent on research-related prescriptions also increases.
general_fig.show()
military_fig.show()
When you consider that fewer than 1 in 5 entries in the original dataset were related to a teaching hospital (see pie chart below), the first correlation is not as obvious as it might seem. The rise in research-related payments as the number of general-care hospitals increases may be explained by hospitals tending to be the first to trial new or experimental drugs.
pie_finding2.show()